SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: SPEAR, Patricia G. 

MONTGOMERY, Rebecca I. 

(ii) TITLE OF INVENTION: HERPES VIRUS ENTRY RECEPTOR PROTEIN 

(iii) NUMBER OF SEQUENCES: 7 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: DRESSLER, GOLDSMITH, SHORE & MILNAMOW
(B) STREET: 180 N. STETSON, SUITE 4700

(C) CITY: CHICAGO 

(D) STATE: ILLINOIS 

(E) COUNTRY: U.S.A 

(F) ZIP: 60601 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: NORTHRUP, THOMAS E. 

(B) REGISTRATION NUMBER: 33,268 

(C) REFERENCE/DOCKET NUMBER: XX

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (312) 616-5400 

(B) TELEFAX: (312) 616-5460 

(C) TELEX: XX 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1719 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: mat_peptide

(B) LOCATION: 293..1189

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 293..1192

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 293..406
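
The feature locations for SEQ ID NO:1 are 1-based, inclusive nucleotide ranges, so their lengths can be checked against the 299-residue protein given as SEQ ID NO:2. The following is a minimal Python sketch of that arithmetic only; the helper name `span_length` and the `features` dictionary are illustrative assumptions and are not part of the listing.

```python
# Feature locations copied from the FEATURE entries for SEQ ID NO:1
# (1-based, inclusive nucleotide positions).
features = {
    "mat_peptide": (293, 1189),
    "CDS": (293, 1192),
    "sig_peptide": (293, 406),
}


def span_length(start, end):
    """Return the length in nucleotides of a 1-based, inclusive range."""
    return end - start + 1


for name, (start, end) in features.items():
    nt = span_length(start, end)
    # A coding feature should span a whole number of codons (remainder 0).
    print(f"{name}: {nt} nt = {nt // 3} codons (remainder {nt % 3})")
```

Under these locations the mat_peptide range covers 897 nucleotides (299 codons), the CDS range 900 nucleotides (300 codons, one more than mat_peptide, consistent with a terminal stop codon), and the sig_peptide range 114 nucleotides (38 codons).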



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CCTTCATACC GGCCCTTCCC CTCGGCTTTG CCTGGACAGC TCTGCCTCCC GCAGGGCCCA 60

CCTGTGTCCC CCAGCGCCGC TCCACCCAGC AGGCCTCAGC CCCTCTCTGC TGCCAGACAC 120

CCCCTGCTGC CCACTCTCCT GCTGCTCGGG TTCTGAGGCA CAGCTTGTCA CACCGAGGCG 180 

GATTCTCTTT CTCTTTCTCT TCTGGCCCAC AGCCGCAGCA ATGGCGCTGA GTTCCTCTGC 240 

TGGAGTTCAT GCTGCTAGCT GGGTTCCCGA GCTGCCGGTC TGAGCCTGAG GC ATG 295 
GAG CCT CCT GGA GAC TGG GGG CCT CCT CCC TGG AGA TCC ACC CCC AGA 343
Glu Pro Pro Gly Asp Trp Gly Pro Pro Pro Trp Arg Ser Thr Pro Arg
5 10 15

ACC GAC GTC TTG AGG CTG GTG CTG TAT CTC ACC TTC CTG GGA GCC CCC 391
Thr Asp Val Leu Arg Leu Val Leu Tyr Leu Thr Phe Leu Gly Ala Pro
20 25 30

[2 codons illegible] GCC CCA GCT CTG CCG TCC TGC AAG GAG GAC GAG TAC CCA GTG 439
Cys Tyr Ala Pro Ala Leu Pro Ser Cys Lys Glu Asp Glu Tyr Pro Val
35 40 45

[16 codons illegible] 487
Gly Ser Glu Cys Cys Pro Thr Cys Ser Pro Gly Tyr Arg Val Lys Glu
50 55 60 65

[16 codons illegible] 535
Ala Cys Gly Glu Leu Thr Gly Thr Val Cys Glu Pro Cys Pro Pro Gly
70 75 80

[16 codons illegible] 583
Thr Tyr Ile Ala His Leu Asn Gly Leu Ser Lys Cys Leu Gln Cys Gln
85 90 95

ATG TGT GAC CCA GCC ATG GGC CTG CGC GCG ACG CGG AAC TGC TCC AGG 631
Met Cys Asp Pro Ala Met Gly Leu Arg Ala Thr Arg Asn Cys Ser Arg
100 105 110

[16 codons illegible] 679
Thr Glu Asn Ala Val Cys Gly Cys Ser Pro Gly His Phe Cys Ile Val
115 120 125

[2 codons illegible] GGG GAC CAC TGC GCC GGT GCC GCC GTT ACG CCA CCT CCA GCC 727
Gln Asp Gly Asp His Cys Ala Gly Ala Ala Val Thr Pro Pro Pro Ala
130 135 140 145

[2 codons illegible] AGA GGG TGC AGA AGG GAG GCA CCG AGA GTC AGG ACA CCC TGT 775
Arg Ala Arg Gly Cys Arg Arg Glu Ala Pro Arg Val Arg Thr Pro Cys
150 155 160

[16 codons illegible] 823
Val Arg Thr Ala Pro Gly Asp Leu Leu Ser Asn Gly Thr Leu Glu Glu
165 170 175

[16 codons illegible] 871
Cys Gln His Gln Thr Lys Cys Ser Trp Leu Val Thr Lys Ala Gly Ala
180 185 190

[16 codons illegible] 919
Gly Thr Ser Ser Ser His Trp Val Trp Trp Phe Leu Ser Gly Ser Leu
195 200 205

[16 codons illegible] 967
Val Ile Val Ile Val Cys Ser Thr Val Gly Leu Ile Ile Cys Val Lys
210 215 220 225

AGA AGA AAG CCA AGG GGT GAT GTA GTC AAG GTG ATC GTC TCC GTC CAG 1015
Arg Arg Lys Pro Arg Gly Asp Val Val Lys Val Ile Val Ser Val Gln
230 235 240 

CGG AAA AGA CAG GAG GCA GAA GGT GAG GCC ACA GTC ATT GAG GCC CTG 1063
Arg Lys Arg Gln Glu Ala Glu Gly Glu Ala Thr Val Ile Glu Ala Leu
245 250 255

CAG GCC CCT CCG GAC GTC ACC ACG GTG GCC GTG AGG AGA CAA TAC CCT 1111
Gln Ala Pro Pro Asp Val Thr Thr Val Ala Val Arg Arg Gln Tyr Pro
260 265 270

CAT TCA CGG GGA GGA GCC CAA ACC ACT GAC CCA CAG ACT CTG CAC CCC 1159
His Ser Arg Gly Gly Ala Gln Thr Thr Asp Pro Gln Thr Leu His Pro
275 280 285 

GAC GCC AGA GAT ACC TGG AGC GAC GGC TGC TGA AAGAGGCTGT CCACCTGGCG 1212 
Asp Ala Arg Asp Thr Trp Ser Asp Gly Cys * 
290 295 300 

AAACCACCGG AGCCCGGAGG CTTGGGGGCT CCGCCCTGGG CTGGCTTCCG TCTCCTCCAG 1272 

TGGAGGGAGA GGTGGGGCCC CTGCTGGGGT AGAGCTGGGG ACGCCACGTG CCATTCCCAT 1332 

GGGCCAGTGA GGGCCTGGGG CCTCTGTTCT GCTGTGGCCT GAGCTCCCCA GAGTCCTGAG 1392 

GAGGAGCGCC AGTTGCCCCT CGCTCACAGA CCACACACCC AGCCCTCCTG GGCCAGCCCA 1452 

GAGGGCCCTT CAGACCCCAG CTGTCTGCGC GTCTGACTCT TGTGGCCTCA GCAGGACAGG 1512 

CCCCGGGCAC TGCCTCACAG CCAAGGCTGG ACTGGSTTGG CTGCAGTGTG GTGTTTAGTG 1572 

GATACCACAT CGGAAGTGAT TTTCTAAATT GGATTTGAAT TCCGGTCCTG TCTTCTATTT 1632

GTCATGAAAC AGTGTATTTG GGGAGATGCT GTGGGAGGAT GTAAATATCT TGTTTCTCCT 1692 

CAAAAAAAAA AAAAAAAAAA AAAAAAA 1719 
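
Each codon row in the coding region above is printed with its three-letter amino acid translation beneath it, which makes the rows easy to cross-check. The sketch below translates the final codon row of the coding region with the standard genetic code; the names `CODON_TABLE`, `ONE_TO_THREE` and `translate_row` are illustrative assumptions, not part of the listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order at each
# position (third position varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter amino acid codes, as used in the listing.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}


def translate_row(row):
    """Translate a space-separated row of codons into three-letter codes."""
    return " ".join(ONE_TO_THREE[CODON_TABLE[codon]] for codon in row.split())


# Final codon row of the coding region, ending in the TGA stop codon.
last_row = "GAC GCC AGA GAT ACC TGG AGC GAC GGC TGC TGA"
print(translate_row(last_row))
# prints: Asp Ala Arg Asp Thr Trp Ser Asp Gly Cys *
```

The printed result agrees with the translation line shown under that row.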

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 299 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Glu Pro Pro Gly Asp Trp Gly Pro Pro Pro Trp Arg Ser Thr Pro 
1 5 10 15 

Arg Thr Asp Val Leu Arg Leu Val Leu Tyr Leu Thr Phe Leu Gly Ala 
20 25 30 

Pro Cys Tyr Ala Pro Ala Leu Pro Ser Cys Lys Glu Asp Glu Tyr Pro 
35 40 45 

Val Gly Ser Glu Cys Cys Pro Thr Cys Ser Pro Gly Tyr Arg Val Lys 
50 55 60 

Glu Ala Cys Gly Glu Leu Thr Gly Thr Val Cys Glu Pro Cys Pro Pro 
65 70 75 80 

Gly Thr Tyr Ile Ala His Leu Asn Gly Leu Ser Lys Cys Leu Gln Cys
85 90 95

Gln Met Cys Asp Pro Ala Met Gly Leu Arg Ala Thr Arg Asn Cys Ser
100 105 110

Arg Thr Glu Asn Ala Val Cys Gly Cys Ser Pro Gly His Phe Cys Ile
115 120 125

Val Gln Asp Gly Asp His Cys Ala Gly Ala Ala Val Thr Pro Pro Pro
130 135 140

Ala Arg Ala Arg Gly Cys Arg Arg Glu Ala Pro Arg Val Arg Thr Pro
145 150 155 160

Cys Val Arg Thr Ala Pro Gly Asp Leu Leu Ser Asn Gly Thr Leu Glu
165 170 175

Glu Cys Gln His Gln Thr Lys Cys Ser Trp Leu Val Thr Lys Ala Gly
180 185 190

Ala Gly Thr Ser Ser Ser His Trp Val Trp Trp Phe Leu Ser Gly Ser
195 200 205

Leu Val Ile Val Ile Val Cys Ser Thr Val Gly Leu Ile Ile Cys Val
210 215 220

Lys Arg Arg Lys Pro Arg Gly Asp Val Val Lys Val Ile Val Ser Val
225 230 235 240

Gln Arg Lys Arg Gln Glu Ala Glu Gly Glu Ala Thr Val Ile Glu Ala
245 250 255

Leu Gln Ala Pro Pro Asp Val Thr Thr Val Ala Val Arg Arg Gln Tyr
260 265 270

Pro His Ser Arg Gly Gly Ala Gln Thr Thr Asp Pro Gln Thr Leu His
275 280 285

Pro Asp Ala Arg Asp Thr Trp Ser Asp Gly Cys
290 295

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

AACCCGGCTC GAGCGGCCGC T 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

GAATTCCACC ACACTTAAGG TG 

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
ACAAGACCGT TGCACCCTC 
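
The three short DNA entries (SEQ ID NO:3 to NO:5) each declare a LENGTH that can be compared with the residues actually printed. The sketch below performs that check; the `entries` dictionary and the function `matches_declared_length` are illustrative assumptions, and the spaces in the strings reproduce the listing's ten-base grouping.

```python
# Declared LENGTH (base pairs) and the sequence as printed in the listing,
# keyed by SEQ ID NO.
entries = {
    3: (21, "AACCCGGCTC GAGCGGCCGC T"),
    4: (22, "GAATTCCACC ACACTTAAGG TG"),
    5: (19, "ACAAGACCGT TGCACCCTC"),
}


def matches_declared_length(declared, printed):
    """Return True if the printed residues (ignoring spaces) match LENGTH."""
    residues = printed.replace(" ", "")
    unexpected = set(residues) - set("ACGT")
    if unexpected:
        raise ValueError(f"non-nucleotide characters: {sorted(unexpected)}")
    return len(residues) == declared


for seq_id, (declared, printed) in entries.items():
    status = "OK" if matches_declared_length(declared, printed) else "MISMATCH"
    print(f"SEQ ID NO:{seq_id}: declared {declared} bp, {status}")
```

All three entries pass this check as printed, which suggests these short sequences survived extraction intact.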

(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4619 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 64..1320

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 64..1317

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

AAGCTTGCAT GCCTGCAGGT CGACTCTAGC TGGGTTCCCG AGCTGCCGGT CTGAGCCTGA 

GGC ATG GAG CCT CCT GGA GAC TGG GGG CCT CCT CCC TGG AGA TCC ACC 
Met Glu Pro Pro Gly Asp Trp Gly Pro Pro Pro Trp Arg Ser Thr 
1 5 10 15 

CCC AGA ACC GAC GTC TTG AGG CTG GTG CTG TAT CTC ACC TTC CTG GGA 
Pro Arg Thr Asp Val Leu Arg Leu Val Leu Tyr Leu Thr Phe Leu Gly 

20 25 30

GCC CCC TGC TAC GCC CCA GCT CTG CCG TCC TGC AAG GAG GAC GAG TAC
Ala Pro Cys Tyr Ala Pro Ala Leu Pro Ser Cys Lys Glu Asp Glu Tyr
35 40 45

CCA GTG GGC TCC GAG TGC TGC CCC ACG TGC AGT CCA GGT TAT CGT GTG 
Pro Val Gly Ser Glu Cys Cys Pro Thr Cys Ser Pro Gly Tyr Arg Val 
50 55 60 

AAG GAG GCC TGC GGG GAG CTG ACG GGC ACA GTG TGT GAA CCC TGC CCT 
Lys Glu Ala Cys Gly Glu Leu Thr Gly Thr Val Cys Glu Pro Cys Pro 
65 70 75 

CCA GGC ACC TAC ATT GCC CAC CTC AAT GGC CTA AGC AAG TGT CTG CAG 
Pro Gly Thr Tyr Ile Ala His Leu Asn Gly Leu Ser Lys Cys Leu Gln
80 85 90 95

TGC CAA ATG TGT GAC CCA GCC ATG GGC CTG CGC GCG ACG CGG AAC TGC
Cys Gln Met Cys Asp Pro Ala Met Gly Leu Arg Ala Thr Arg Asn Cys
100 105 110

TCC AGG ACA GAG AAC GCC GTG TGT GGC TGC AGC CCA GGC CAC TTC TGC 
Ser Arg Thr Glu Asn Ala Val Cys Gly Cys Ser Pro Gly His Phe Cys 
115 120 125 

ATC GTC CAG GAC GGG GAC CAC TGC GCC GGT GCC GCC GTT ACG CCA CCT 
Ile Val Gln Asp Gly Asp His Cys Ala Gly Ala Ala Val Thr Pro Pro
130 135 140

CCA GCC CGG GCC AGA GGG TGC AGA AGG GAG GCA CCG AGA GTC AGG ACA
Pro Ala Arg Ala Arg Gly Cys Arg Arg Glu Ala Pro Arg Val Arg Thr
145 150 155

[9 codons illegible] CTT CTC TCC AAT GGG ACC CTG
Pro Cys Val Arg Thr Ala Pro Gly Asp Leu Leu Ser Asn Gly Thr Leu
160 165 170 175

[9 codons illegible] AGA ATT CAC AAG ACC GTT GCA
Glu Glu Cys Gln His Gln Thr Lys Cys Arg Ile His Lys Thr Val Ala
180 185 190

[8 codons illegible] TGC CCA CCC CCT GAA CTC CTG GGG
Pro Ser Thr Cys Ser Lys Pro Thr Cys Pro Pro Pro Glu Leu Leu Gly
195 200 205

[8 codons illegible] CCA AAA CCC AAG GAC ACC CTC ATG
Gly Pro Ser Val Phe Ile Phe Pro Pro Lys Pro Lys Asp Thr Leu Met
210 215 220

ATC TCA CGC ACC CCC GAG GTC ACA TGC GTG GTG GTG GAC GTG AGC CAG
Ile Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser Gln
225 230 235

GAT GAC CCC GAG GTG CAG TTC ACA TGG TAC ATA AAC AAC GAG CAG GTG
Asp Asp Pro Glu Val Gln Phe Thr Trp Tyr Ile Asn Asn Glu Gln Val
240 245 250 255

[6 codons illegible] CTA [1 codon illegible] GAG CAG CAG TTC AAC AGC ACG ATC
Arg Thr Ala Arg Pro Pro Leu Arg Glu Gln Gln Phe Asn Ser Thr Ile
260 265 270

[9 codons illegible] CAC CAG GAC TGG CTG AGG GGC
Arg Val Val Ser Thr Leu Pro Ile Thr His Gln Asp Trp Leu Arg Gly
275 280 285

[6 codons illegible] GTC CAC AAC AAG GCA CTC CCG GCC CCC ATC
Lys Glu Phe Lys Cys Lys Val His Asn Lys Ala Leu Pro Ala Pro Ile
290 295 300

GAG AAA ACC ATC TCC AAA GCC AGA GGG CAG CCC CTG GAG CCG AAG GTC
Glu Lys Thr Ile Ser Lys Ala Arg Gly Gln Pro Leu Glu Pro Lys Val
305 310 315

[9 codons illegible] CTG AGC AGC AGG TGG GTC AGC
Tyr Thr Met Gly Pro Pro Arg Glu Glu Leu Ser Ser Arg Ser Val Ser
320 325 330 335

[5 codons illegible] AAC [4 codons illegible] TCC GAC ATC TCG GTG GAG
Leu Thr Cys Met Ile Asn Gly Phe Tyr Pro Ser Asp Ile Ser Val Glu
340 345 350

[3 codons illegible] AAC [2 codons illegible] GCA [2 codons illegible] AAC TAC AAG ACC ACG CCG GCC
Trp Glu Lys Asn Gly Lys Ala Glu Asp Asn Tyr Lys Thr Thr Pro Ala
355 360 365

[9 codons illegible] CTC TAC AAC AAG CTC TCA GTG
Val Leu Asp Ser Asp Gly Ser Tyr Phe Leu Tyr Asn Lys Leu Ser Val
370 375 380

CCC ACG AGT GAG TGG CAG CGG GGC GAC GTC TTC ACC TGC TCC GTG ATG
Pro Thr Ser Glu Trp Gln Arg Gly Asp Val Phe Thr Cys Ser Val Met
385 390 395

CAC GAG GCC TTG CAC AAC CAC TAC ACG CAG AAG TCC ATC TCC CGC TCT
His Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Ile Ser Arg Ser
400 405 410 415

CCG GGT AAA TGA GCGCTGTGCC GGCGAGCTGC CCCTCTCCCT CCCCCCCACG 1360
Pro Gly Lys *

CCGCAGCTGT GCACCCCGCA CACAAATAAA GCACCCAGCT CTGCCCTGAA CAGCTTCCGG 1420 

TCTCCCTATA GTGAGTCGTA TTAATTTCGA TAAGCCAGCT GCATTAATGA ATCGGCCAAC 1480 

GCGCGGGGAG AGGCGGTTTG CGTATTGGGC GCTCTTCCGC TTCCTCGCTC ACTGACTCGC 1540 

TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA CTCAAAGGCG GTAATACGGT 1600 

TATCCACAGA ATCAGGGGAT AACGCAGGAA AGAACATGTG AGCAAAAGGC CAGCAAAAGG 1660 

CCAGGAACCG TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG 1720

AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA CTATAAAGAT 1780 

ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC CTGCCGCTTA 1840 

CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC GCTTTCTCAT AGCTCACGCT 1900 

GTAGGTATCT CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG CACGAACCCC 1960 

CCGTTCAGCC CGACCGCTGC GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA 2020 

GACACGACTT ATCGCCACTG GCAGCAGCCA CTGGTAACAG GATTAGCAGA GCGAGGTATG 2080 

TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCTACACT AGAAGGACAG 2140 

TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG AAAAAGAGTT GGTAGCTCTT 2200 

GATCCGGCAA ACAAACCACC GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG CAGCAGATTA 2260

CGCGCAGAAA AAAAGGATCT CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC 2320 

AGTGGAACGA AAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA AGGATCTTCA 2380 

CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA TATGAGTAAA 2440 

CTTGGTCTGA CAGTTACCAA TGCTTAATCA GTGAGGCACC TATCTCAGCG ATCTGTCTAT 2500 

TTCGTTCATC CATAGTTGGC TGACTCCCCG TCGTGTAGAT AACTACGATA CGGGAGGGCT 2560

TACCATCTGG CCCCAGTGCT GCAATGATAC CGCGAGACCC ACGCTCACCG GCTCCAGATT 2620 

TATCAGCAAT AAACCAGCCA GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT GCAACTTTAT 2680 
CCGCCTCCAT CCAGTCTATT AATTGTTGCC GGGAAGCTAG AGTAAGTAGT TCGCCAGTTA 2740

ATAGTTTGCG CAACGTTGTT GCCATTGCTA CAGGCATCGT GGTGTCACGC TCGTCGTTTG 2800 

GTATGGCTTC ATTCAGCTCC GGTTCCCAAC GATCAAGGCG AGTTACATGA TCCCCCATGT 2860 

TGTGCAAAAA AGCGGTTAGC TCCTTCGGTC CTCCGATCGT TGTCAGAAGT AAGTTGGCCG 2920 
CAGTGTTATC ACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC ATGCCATCCG 2980 

TAAGATGCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC ATTCTGAGAA TAGTGTATGC 3040
GGCGACCGAG TTGCTCTTGC CCGGCGTCAA TACGGGATAA TACCGCGCCA CATAGCAGAA 3100 
CTTTAAAAGT GCTCATCATT GGAAAACGTT CTTCGGGGCG AAAACTCTCA AGGATCTTAC 3160 
CGCTGTTGAG ATCCAGTTCG ATGTAACCCA CTCGTGCACC CAACTGATCT TCAGCATCTT 3220 
TTACTTTCAC CAGCGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC GCAAAAAAGG 3280 
GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT CCTTTTTCAA TATTATTGAA 3340 
GCATTTATCA GGGTTATTGT CTCATGAGCG GATACATATT TGAATGTATT TAGAAAAATA 3400

AACAAATAGG GGTTCCGCGC ACATTTCCCC GAAAAGTGCC ACCTGACGTC TAAGAAACCA 3460 

TTATTATCAT GACATTAACC TATAAAAATA GGCGTATCAC GAGGCCCTTT CGTCTCGCGC 3520 

GTTTCGGTGA TGACGGTGAA AACCTCTGAC ACATGCAGCT CCCGGAGACG GTCACAGCTT 3580 

GTCTGTAAGC GGATGCCGGG AGCAGACAAG CCCGTCAGGG CGCGTCAGCG GGTGTTGGCG 3640 

GGTGTCGGGG CTGGCTTAAC TATGCGGCAT CAGAGCAGAT TGTACTGAGA GTGCACCATA 3700 

TCGACGCTCT CCCTTATGCG ACTCCTGCAT TAGGAAGCAG CCCAGTAGTA GGTTGAGGCC 3760 

GTTGAGCACC GCCGCCGCAA GGAATGGTGC AAGGAGATGG CGCCCAACAG TCCCCCGGCC 3820 

ACGGGGCCTG CCACCATACC CACGCCGAAA CAAGCGCTCA TGAGCCCGAA GTGGCGAGCC 3880

CGATCTTCCC CATCGGTGAT GTCGGCGATA TAGGCGCCAG CAACCGCACC TGTGGCGCCG 3940 

GTGATGCCGG CCACGATGCG TCCGGCGTAG AGGATCTGGC TAGTTATTAA TAGTAATCAA 4000 

TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG CGTTACATAA CTTACGGTAA 4060 

ATGGCCCGCC TGGCTGACCG CCCAACGACC CCCGCCCATT GACGTCAATA ATGACGTATG 4120 

TTCCCATAGT AACGCCAATA GGGACTTTCC ATTGACGTCA ATGGGTGGAC TATTTACGGT 4180 

AAACTGCCCA CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTACGCCC CCTATTGACG 4240 

TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA TGGGACTTTC 4300 

CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC CATGGTGATG CGGTTTTGGC 4360 

AGTACATCAA TGGGCGTGGA TAGCGGTTTG ACTCACGGGG ATTTCCAAGT CTCCACCCCA 4420 

TTGACGTCAA TGGGAGTTTG TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA 4480 

ACAACTCCGC CCCATTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 4540

GCAGAGCTCT CTGGCTAACT AGAGAACCCA CTGCTTAACT GGCTTATCGA AATTAATACG 4600 

ACTCACTATA GGGAGACCC 4619 

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 418 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Met Glu Pro Pro Gly Asp Trp Gly Pro Pro Pro Trp Arg Ser Thr Pro 
1 5 10 15

Arg Thr Asp Val Leu Arg Leu Val Leu Tyr Leu Thr Phe Leu Gly Ala
20 25 30

Pro Cys Tyr Ala Pro Ala Leu Pro Ser Cys Lys Glu Asp Glu Tyr Pro
35 40 45

Val Gly Ser Glu Cys Cys Pro Thr Cys Ser Pro Gly Tyr Arg Val Lys
50 55 60

Glu Ala Cys Gly Glu Leu Thr Gly Thr Val Cys Glu Pro Cys Pro Pro
65 70 75 80

Gly Thr Tyr Ile Ala His Leu Asn Gly Leu Ser Lys Cys Leu Gln Cys
85 90 95

Gln Met Cys Asp Pro Ala Met Gly Leu Arg Ala Thr Arg Asn Cys Ser
100 105 110

Arg Thr Glu Asn Ala Val Cys Gly Cys Ser Pro Gly His Phe Cys Ile
115 120 125

Val Gln Asp Gly Asp His Cys Ala Gly Ala Ala Val Thr Pro Pro Pro
130 135 140

Ala Arg Ala Arg Gly Cys Arg Arg Glu Ala Pro Arg Val Arg Thr Pro
145 150 155 160

Cys Val Arg Thr Ala Pro Gly Asp Leu Leu Ser Asn Gly Thr Leu Glu
165 170 175

Glu Cys Gln His Gln Thr Lys Cys Arg Ile His Lys Thr Val Ala Pro
180 185 190

Ser Thr Cys Ser Lys Pro Thr Cys Pro Pro Pro Glu Leu Leu Gly Gly
195 200 205

Pro Ser Val Phe Ile Phe Pro Pro Lys Pro Lys Asp Thr Leu Met Ile
210 215 220

Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser Gln Asp
225 230 235 240

Asp Pro Glu Val Gln Phe Thr Trp Tyr Ile Asn Asn Glu Gln Val Arg
245 250 255

Thr Ala Arg Pro Pro Leu Arg Glu Gln Gln Phe Asn Ser Thr Ile Arg
260 265 270

Val Val Ser Thr Leu Pro Ile Thr His Gln Asp Trp Leu Arg Gly Lys
275 280 285

Glu Phe Lys Cys Lys Val His Asn Lys Ala Leu Pro Ala Pro Ile Glu
290 295 300

Lys Thr Ile Ser Lys Ala Arg Gly Gln Pro Leu Glu Pro Lys Val Tyr
305 310 315 320

Thr Met Gly Pro Pro Arg Glu Glu Leu Ser Ser Arg Ser Val Ser Leu
325 330 335

Thr Cys Met Ile Asn Gly Phe Tyr Pro Ser Asp Ile Ser Val Glu Trp
340 345 350

Glu Lys Asn Gly Lys Ala Glu Asp Asn Tyr Lys Thr Thr Pro Ala Val
355 360 365

Leu Asp Ser Asp Gly Ser Tyr Phe Leu Tyr Asn Lys Leu Ser Val Pro
370 375 380

Thr Ser Glu Trp Gln Arg Gly Asp Val Phe Thr Cys Ser Val Met His
385 390 395 400

Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Ile Ser Arg Ser Pro
405 410 415

Gly Lys 